Outlier Detection in Multivariate Data

Author

  • K. Senthamarai Kannan
Abstract

The objective of this research is to detect outliers in multivariate data using various distance measures, in particular robust regression diagnostic techniques. Several classical outlier identification methods are based on the sample mean and covariance matrix. However, they do not always yield good results, because these estimates are themselves affected by the outliers. Sometimes one outlying point hides other outliers, which is known as the masking effect. To identify them, methods that account for the masking effect among outlying points are used. An appropriate method is adopted to identify the masked outliers and to compare the various distance measures. Mathematics Subject Classification: 62H99, 62J05
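The masking effect described in the abstract can be illustrated with a small sketch. It is not the method of the paper: it compares the classical squared Mahalanobis distance (sample mean and covariance) with a simplified concentration-step estimate computed from the most central subset of points, a swapped-in robust alternative chosen for illustration. The toy data, the function names, the subset size `h`, and the 0.975 chi-square cutoff are all assumptions, and the closed-form cutoff is valid only for two dimensions.

```python
import math
from statistics import median

def mean_cov(points):
    """Sample mean and covariance (n - 1 denominator) of 2-D points."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)

def sq_mahalanobis(point, center, cov):
    """Squared Mahalanobis distance using the explicit 2x2 inverse."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy ** 2
    dx, dy = point[0] - center[0], point[1] - center[1]
    return (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det

def robust_center_cov(points, h, max_iter=20):
    """Simplified concentration steps (illustrative, no consistency factor).

    Start from the h points nearest the coordinate-wise median, then
    repeatedly refit on the h points with the smallest distances.
    """
    n = len(points)
    med = (median(x for x, _ in points), median(y for _, y in points))
    order = sorted(range(n), key=lambda i: (points[i][0] - med[0]) ** 2
                                           + (points[i][1] - med[1]) ** 2)
    subset = sorted(order[:h])
    for _ in range(max_iter):
        center, cov = mean_cov([points[i] for i in subset])
        d2 = [sq_mahalanobis(p, center, cov) for p in points]
        new_subset = sorted(sorted(range(n), key=d2.__getitem__)[:h])
        if new_subset == subset:
            break
        subset = new_subset
    return mean_cov([points[i] for i in subset])

# Hypothetical toy data: a 3x3 grid of clean points plus a clump of 3 outliers.
clean = [(x, y) for x in (-1, 0, 1) for y in (-1, 0, 1)]
outliers = [(10, 10), (10, 11), (11, 10)]  # indices 9, 10, 11
data = clean + outliers

# 0.975 quantile of chi-square with 2 degrees of freedom (closed form for p = 2).
cutoff = -2.0 * math.log(1.0 - 0.975)

c_center, c_cov = mean_cov(data)
classical_d2 = [sq_mahalanobis(p, c_center, c_cov) for p in data]
classical_flags = [i for i, d in enumerate(classical_d2) if d > cutoff]

h = int(0.75 * len(data))  # assumed subset size, not taken from the paper
r_center, r_cov = robust_center_cov(data, h)
robust_d2 = [sq_mahalanobis(p, r_center, r_cov) for p in data]
robust_flags = [i for i, d in enumerate(robust_d2) if d > cutoff]

print("classical flags:", classical_flags)
print("robust flags:", robust_flags)
```

With this toy data the classical rule flags nothing, because the clump of outliers inflates the sample covariance along its own direction, while the subset-based distances flag indices 9, 10 and 11. A production analysis would more likely rely on an established robust estimator and apply a consistency correction to the covariance.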

Similar resources

Identification of outliers types in multivariate time series using genetic algorithm

Multivariate time series data are often modeled using the vector autoregressive moving average (VARMA) model. However, the presence of outliers can violate the stationarity assumption and may lead to incorrect modeling, biased parameter estimates, and inaccurate predictions. Thus, detection of these points and how to deal properly with them, especially in relation to modeling and parameter estimation of VARMA m...

Multivariate Outlier Detection Using Independent Component Analysis

Recent developments have considered a rather unexpected application of the theory of independent component analysis (ICA) in outlier detection, data clustering, multivariate data visualization, etc. Accurate identification of outliers plays an important role in statistical analysis. If classical statistical models are blindly applied to data containing outliers, the results can be ...

Chapter 1 OUTLIER DETECTION

Outlier detection is a primary step in many data-mining applications. We present several methods for outlier detection, while distinguishing between univariate vs. multivariate techniques and parametric vs. nonparametric procedures. In the presence of outliers, special attention should be paid to ensuring the robustness of the estimators used. Outlier detection for data mining is often based on dist...

Outlier Detection Methods in Multivariate Regression Models

Outlier detection statistics based on two models, the case-deletion model and the mean-shift model, are developed in the context of a multivariate linear regression model. These are generalizations of the univariate Cook’s distance and other diagnostic statistics. Approximate distributions of the proposed statistics are also obtained to get suitable cutoff points for significance tests. In addi...

Multivariate outlier detection with compositional data

Multivariate outlier detection is usually based on Mahalanobis distances, by plugging in robust estimates of location and covariance. For compositional data, carrying only relative information, a special transformation needs to be consulted in order to be able to work in the appropriate geometry. The effect of the transformation is discussed in this contribution. Furthermore, different possibil...

Multivariate Outlier Detection and Treatment in Business Surveys

Multivariate outlier detection based on the Mahalanobis distance with the BACON-EEM algorithm, the TRC algorithm and the ER algorithm is presented and imputation of outliers and further missing values is discussed. The methods are illustrated with a data set on Swedish municipalities. The relation between outliers, influential observations and selective editing is explored. Finally robust multi...

Publication date: 2015